Conversation

Copilot AI commented Jan 1, 2026

Analyzed architectural requirements for operating behind a load balancer given SQLite's single-writer constraint. Evaluated 6 SQLite replication solutions; all require significant tradeoffs (API migration, vendor lock-in, or still single-writer). Recommend guild-based pod sharding instead.

Proposed Architecture

Split by Discord's natural boundary—guilds. Each gateway pod handles a subset of guilds with its own SQLite database. Config service tracks assignments.

Load Balancer
  → HTTP Service (stateless, HPA: 2-10 pods)
    → Config Service (PostgreSQL, 2 pods)
      → Gateway Pods (stateful, 3-10 pods, each with SQLite + Litestream)
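
A minimal sketch of the lookup the stateless HTTP service could perform against the config service, assuming assignments live in a PostgreSQL table named `guild_assignments` and that gateway pods are reachable at stable StatefulSet DNS names; the table, environment variable, and port are illustrative and do not exist in the codebase yet.

```typescript
// Hypothetical guild→pod lookup for the stateless HTTP service.
// Assumes a config-service-owned table guild_assignments(guild_id, pod_name);
// names here are assumptions, not existing code.
import { Pool } from "pg";

const pool = new Pool({ connectionString: process.env.CONFIG_DB_URL });

export async function resolveGatewayPod(guildId: string): Promise<string> {
  const { rows } = await pool.query(
    "SELECT pod_name FROM guild_assignments WHERE guild_id = $1",
    [guildId],
  );
  if (rows.length === 0) {
    throw new Error(`No gateway pod assigned for guild ${guildId}`);
  }
  // StatefulSet pods get stable DNS names, e.g. gateway-2.gateway:3000
  return `http://${rows[0].pod_name}.gateway:3000`;
}
```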

Deliverables

Documentation (/notes/)

  • LOAD_BALANCER_QUICK_REF.md - Single-page reference
  • LOAD_BALANCER_INDEX.md - Navigation guide
  • 2026-01-01_1_load-balancer-architecture.md - Full analysis
  • 2026-01-01_2_architecture-diagrams.md - Mermaid flow diagrams
  • 2026-01-01_3_sqlite-sync-comparison.md - Evaluation: Litestream, LiteFS, rqlite, Turso, Marmot, Dqlite
  • 2026-01-01_4_implementation-guide.md - Phase-by-phase code examples
  • 2026-01-01_5_executive-summary.md - Cost/ROI/timeline
  • 2026-01-01_6_ascii-diagrams.md - Request flows, failure scenarios

Kubernetes Manifests (/cluster/proposed/)

Production-ready configs for all components with HPA, PDB, Litestream sidecars, and ingress routing.

Key Decisions

| Component | Solution | Rationale |
| --- | --- | --- |
| Scaling strategy | Guild sharding | Works with existing code, no SQLite migration |
| Backup | Litestream | Low overhead, battle-tested, point-in-time recovery |
| Config store | PostgreSQL | Small dataset, multi-writer capable |
| HTTP/Gateway split | Separate services | Independent scaling, clear failure domains |

Trade-offs

  • Cost: $10/mo → $45-50/mo (5x)
  • Complexity: +2 services, +guild routing logic
  • Operations: Guild reassignment on scaling
  • Benefit: Horizontal scaling, 99.9% uptime, zero-downtime deploys

Implementation Path

6-8 weeks across 5 phases: config service → gateway filtering → HTTP routing → Litestream integration → production deployment.
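
As a rough illustration of the gateway-filtering phase, assuming the bot uses discord.js, each gateway pod would drop events for guilds it has not been assigned; the assignment set and its refresh mechanism are placeholders.

```typescript
// Sketch of per-pod guild filtering. The assignedGuilds set would be kept in
// sync with the config service; the refresh mechanism is omitted here.
import { Client, GatewayIntentBits } from "discord.js";

const assignedGuilds = new Set<string>(); // populated from the config service

const client = new Client({
  intents: [GatewayIntentBits.Guilds, GatewayIntentBits.GuildMessages],
});

client.on("messageCreate", (message) => {
  // Ignore events for guilds this pod does not own.
  if (!message.guildId || !assignedGuilds.has(message.guildId)) return;
  // ...existing handlers run only for assigned guilds
});

client.login(process.env.DISCORD_TOKEN);
```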

SQLite remains viable at scale with proper partitioning. No vendor lock-in, minimal code changes.

Original prompt

Analyze the codebase and chart out the necessary changes to permit this service to operate behind a load balancer. The project uses SQLite, which is a major constraint. There exist SQLite sync tools; include those in your report. One clear option is to split traffic by guild, but that would require some ability to reshuffle which pods run which guild IDs. Include an architectural diagram for any k8s revisions; it may make sense to separate config services from the pods that connect to gateways and bind to HTTP.


Copilot AI and others added 4 commits January 1, 2026 21:01
Copilot AI changed the title from "[WIP] Analyze codebase for load balancer compatibility" to "Architecture analysis: Load balancing with SQLite constraint" on Jan 1, 2026
Copilot AI requested a review from vcarl January 1, 2026 21:08

vcarl commented Jan 2, 2026

claude ✨

Key Finding

The proposed manifests assume code changes that don't exist. The app is monolithic - there's no SERVICE_MODE support, no config service integration, and no guild-to-pod routing.
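
For context, a minimal sketch of what SERVICE_MODE support might look like once implemented; the mode names and the two start functions are placeholders, not existing entry points in the app.

```typescript
// Hypothetical SERVICE_MODE switch (no such support exists in the app today).
async function startHttpServer(): Promise<void> {
  /* bind HTTP routes; no Discord gateway connection */
}
async function startGatewayClient(): Promise<void> {
  /* connect to the Discord gateway and open the per-pod SQLite database */
}

const mode = process.env.SERVICE_MODE ?? "monolith";

async function main() {
  if (mode === "http" || mode === "monolith") await startHttpServer();
  if (mode === "gateway" || mode === "monolith") await startGatewayClient();
}

void main();
```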

Risk/Reward Tiers (Safe Wins First)

| Tier | Change | Risk | Code Changes | Deploy Now? |
| --- | --- | --- | --- | --- |
| 1 | Litestream backup sidecar | Low | None | Yes |
| 1 | Ingress rate limiting/timeouts | Very Low | None | Yes |
| 2 | PostgreSQL (isolated) | Low | None | Yes (prep) |
| 3 | SERVICE_MODE support | Medium | Required | After code |
| 3 | Config service | Medium-High | New app | After code |
| 3 | HTTP service separation | Medium | Required | After code |
| 4 | Gateway StatefulSet + sharding | High | Extensive | Last |

Recommended First Steps

  1. Litestream sidecar - Add to existing StatefulSet today. Zero app changes, immediate disaster recovery.
  2. Ingress enhancements - Add timeout/rate-limit annotations. 5-minute change.
  3. PostgreSQL - Deploy isolated, ready for when config service code exists (see the table-bootstrap sketch below).
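
For step 3, a one-off bootstrap sketch of the kind of table the future config service would own, using the pg client; the schema, table name, and environment variable are assumptions made for illustration.

```typescript
// Prep script: create a hypothetical guild_assignments table in the isolated
// PostgreSQL instance so it is ready when config-service code lands.
import { Client } from "pg";

async function bootstrap() {
  const client = new Client({ connectionString: process.env.CONFIG_DB_URL });
  await client.connect();
  await client.query(`
    CREATE TABLE IF NOT EXISTS guild_assignments (
      guild_id TEXT PRIMARY KEY,
      pod_name TEXT NOT NULL,
      assigned_at TIMESTAMPTZ NOT NULL DEFAULT now()
    )
  `);
  await client.end();
}

void bootstrap();
```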
